Fast Constant Weight Codeword to Index Converter

J.  T.  Butler  T.  Sasao 


Department  of  Electrical  and  Computer  Engineering 
Naval  Postgraduate  School 
Monterey,  CA  U.S.A. 

Abstract: Constant weight codewords, in which the number
of 1's is fixed, are essential to many coding applications. In this
paper, we show an efficient circuit that converts a constant weight
codeword into a unique index of that codeword. For example,
this circuit is necessary when constant weight codewords are
used to transmit data on and off chip. Our circuit is based on
the combinatorial number system, in which the digits are binomial
coefficients. It has $O(n^3)$ area complexity and $O(n)$ delay, where
n is the number of variables. Two types of circuits are proposed.
Various constant weight codes are implemented on an FPGA,
including a 64-out-of-128 code. These implementations support
our complexity analysis.

I. Introduction

A  constant  weight  code  to  index  converter  is  needed  when 
constant  weight  Gray  codes  are  used  to  encode  data  in  flash 
memory.  In  local  rank  modulation  [2],  data  stored  in  flash 
memory  is  viewed  as  an  n-bit  constant  weight  codeword  that 
differs  in  exactly  two  bits  from  an  adjacent  memory  location 
(because  of  overlap).  All  codewords  in  this  encoding  have  the 
same weight (number of 1's).

Balanced codes, with as many 0's as 1's, can be used to
transfer data on and off VLSI chips so that the current
fluctuations are minimized [7]. On the other hand, codes with small
weight are desired in this application because they yield faster
and more compact circuits [7]. Constant weight codewords
can be used to counter "side-channel" attacks against secure
systems [4]. Such attacks use data dependent differences in
power consumption to extract hidden information. Constant
weight codewords have been used in asynchronous logic to
implement delay-insensitive codewords [8].

The use of constant weight codewords requires two parts,
an index to constant weight code converter and a constant
weight code to index converter. We considered the first part
in [1]. However, we have not seen a hardware implementation
of the second part, except for an implementation that requires
$O(2^n)$ complexity [6]. In this paper, we propose an
implementation with $O(n^3)$ complexity. In Section II, we discuss
the combinatorial number system. We show how it can be
used to convert a constant weight codeword to an index, and
we present its circuit implementation. Then, in Section III, we
show an improvement to this circuit that significantly reduces
delay for large n. Finally, in Section IV, we give concluding
remarks.


Department  of  Computer  Science  and  Electronics 
Kyushu  Institute  of  Technology 
Iizuka,  Fukuoka,  JAPAN 

II.  The  Combinatorial  Number  System 
A.  Introduction 

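The basis for our constant weight code to index converter
is the combinatorial number system [3].

Definition 1. In an $\binom{n}{r}$ combinatorial number system [5],
an integer $N < \binom{n}{r}$ is represented as $N = c_r c_{r-1} \ldots c_1$,
where

$$N = \binom{c_r}{r} + \binom{c_{r-1}}{r-1} + \cdots + \binom{c_1}{1} \qquad (1)$$

and $c_r > c_{r-1} > \cdots > c_1 \ge 0$.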

Example 1. Table I shows the representation of integers in
the $\binom{6}{3}$ combinatorial number system. The leftmost column
shows the integer's value in decimal and its vector
representation. The middle column shows how this value is computed
according to (1). The rightmost column of Table I shows the
corresponding 6 bit constant weight code. Note that the three
elements of the vector representation shown in the leftmost
column correspond to the positions of the 1's in the constant
weight codeword. For example, 19 = 5 4 3 corresponds to
111000, there being 1's in positions 5, 4, and 3.

(End  of  Example) 
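
The mapping of Definition 1 and Example 1 can be sketched in software. The following Python fragment is only an illustration of equation (1), not the circuit proposed in this paper; the function name `codeword_to_index` and the string representation of the codeword (leftmost character is the highest bit position) are assumptions made for this sketch.

```python
from math import comb


def codeword_to_index(codeword: str) -> int:
    """Index of a constant weight codeword, following equation (1).

    `codeword` is a string of '0'/'1' characters (hypothetical input
    format); the leftmost character is bit position n-1.
    """
    n = len(codeword)
    # Positions of the 1's, highest first: c_r > c_{r-1} > ... > c_1.
    positions = [n - 1 - i for i, bit in enumerate(codeword) if bit == "1"]
    r = len(positions)
    # The j-th 1 from the left (j = 0, 1, ...) is digit c_{r-j} and
    # contributes C(c_{r-j}, r-j). math.comb returns 0 when k > n.
    return sum(comb(c, r - j) for j, c in enumerate(positions))


if __name__ == "__main__":
    for word in ("111000", "101010", "000111"):
        print(word, codeword_to_index(word))
```

For the 3-out-of-6 code, the twenty codewords map to the twenty distinct indices 0 through 19, in agreement with Table I.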

B.  Circuit  Implementation 

A major contribution of this paper is to show how the
combinatorial number system can be used to realize an
efficient circuit that transforms a constant weight codeword to
the index for that codeword. Such a circuit has as inputs the
values of the rightmost column of Table I (the bits of the
constant weight code) and has as outputs the standard binary
number representation of the numbers shown in the leftmost
column (the values of the index N). As shown in Table I, each
1 bit in the constant weight code contributes a value to the
corresponding index depending on the 1 bit's position in the
codeword. For example, from Table I, the 1's in the codeword
111000 contribute $\binom{5}{3}$, $\binom{4}{2}$, and $\binom{3}{1}$, from left to right. This
can be seen in Fig. 1, which shows a circuit that converts a
3-out-of-6 constant weight codeword into the corresponding
index of that codeword.

TABLE I

The $\binom{6}{3}$ Combinatorial Number System for $0 \le N \le 19$

| N = c3 c2 c1 | Computing the Value of N for r = 3 | Const. Wght. Code (bits 543210) |
|---|---|---|
| 19 = 5 4 3 | $\binom{5}{3} + \binom{4}{2} + \binom{3}{1}$ = 10 + 6 + 3 | 111000 |
| 18 = 5 4 2 | $\binom{5}{3} + \binom{4}{2} + \binom{2}{1}$ = 10 + 6 + 2 | 110100 |
| 17 = 5 4 1 | $\binom{5}{3} + \binom{4}{2} + \binom{1}{1}$ = 10 + 6 + 1 | 110010 |
| 16 = 5 4 0 | $\binom{5}{3} + \binom{4}{2} + \binom{0}{1}$ = 10 + 6 + 0 | 110001 |
| 15 = 5 3 2 | $\binom{5}{3} + \binom{3}{2} + \binom{2}{1}$ = 10 + 3 + 2 | 101100 |
| 14 = 5 3 1 | $\binom{5}{3} + \binom{3}{2} + \binom{1}{1}$ = 10 + 3 + 1 | 101010 |
| 13 = 5 3 0 | $\binom{5}{3} + \binom{3}{2} + \binom{0}{1}$ = 10 + 3 + 0 | 101001 |
| 12 = 5 2 1 | $\binom{5}{3} + \binom{2}{2} + \binom{1}{1}$ = 10 + 1 + 1 | 100110 |
| 11 = 5 2 0 | $\binom{5}{3} + \binom{2}{2} + \binom{0}{1}$ = 10 + 1 + 0 | 100101 |
| 10 = 5 1 0 | $\binom{5}{3} + \binom{1}{2} + \binom{0}{1}$ = 10 + 0 + 0 | 100011 |
| 9 = 4 3 2 | $\binom{4}{3} + \binom{3}{2} + \binom{2}{1}$ = 4 + 3 + 2 | 011100 |
| 8 = 4 3 1 | $\binom{4}{3} + \binom{3}{2} + \binom{1}{1}$ = 4 + 3 + 1 | 011010 |
| 7 = 4 3 0 | $\binom{4}{3} + \binom{3}{2} + \binom{0}{1}$ = 4 + 3 + 0 | 011001 |
| 6 = 4 2 1 | $\binom{4}{3} + \binom{2}{2} + \binom{1}{1}$ = 4 + 1 + 1 | 010110 |
| 5 = 4 2 0 | $\binom{4}{3} + \binom{2}{2} + \binom{0}{1}$ = 4 + 1 + 0 | 010101 |
| 4 = 4 1 0 | $\binom{4}{3} + \binom{1}{2} + \binom{0}{1}$ = 4 + 0 + 0 | 010011 |
| 3 = 3 2 1 | $\binom{3}{3} + \binom{2}{2} + \binom{1}{1}$ = 1 + 1 + 1 | 001110 |
| 2 = 3 2 0 | $\binom{3}{3} + \binom{2}{2} + \binom{0}{1}$ = 1 + 1 + 0 | 001101 |
| 1 = 3 1 0 | $\binom{3}{3} + \binom{1}{2} + \binom{0}{1}$ = 1 + 0 + 0 | 001011 |
| 0 = 2 1 0 | $\binom{2}{3} + \binom{1}{2} + \binom{0}{1}$ = 0 + 0 + 0 | 000111 |



Fig.  2.  Decoder  and  Binomial  Constant  Generator. 


This  circuit  contains  an  array  of  decoders  that  control  which 
digits  occur  in  the  combinatorial  number.  Fig.  2  shows  the 
detail  of  the  decoders  and  the  tri-state  circuit  that  provides 
constants  for  the  combinatorial  number.  We  can  make  the 
following  observations. 

1) If a 1 occurs on either or both inputs to the OR gate in
the upper left hand corner of the decoder, then a 1 is
produced at exactly one of the two outputs. Specifically,
if $x_i$, the input driving the decoder, is 1, then a 1 appears
at the horizontal output (solid line), and a 0 appears at
the vertical output (dotted line). On the other hand, if $x_i$
is 0, then a 0 appears at the horizontal output and a 1 appears
at the vertical output. However, if both inputs to the OR
gate in the upper left hand corner are 0, then both the
horizontal and vertical outputs produce 0.

2) There is a path of 1's through the array of decoders in
Fig. 1 beginning at the upper left hand corner. The path is
determined by the values of $x_i$ in the constant weight codeword.
For example, if $x_5x_4x_3x_2x_1x_0 = 111000$, then the three
decoders along the top of the array of decoders starting
from the upper left hand decoder all produce 1 at their
horizontal outputs. This is because their inputs, $x_5$, $x_4$,
and $x_3$, are 1. However, the decoder in the upper right
hand corner is driven by $x_2$, which is 0. So, the 1 at
its OR gate input is now directed to its vertical output
(while its horizontal output is 0). Because $x_1$ and $x_0$ are
both 0, this 1 is directed downward (along dotted lines)
through two decoders into the 2-input OR gate that drives
Valid. That is, when $x_5x_4x_3x_2x_1x_0 = 111000$, Valid
is 1, indicating the input codeword is a valid 3-out-of-6
codeword.

3) All other valid codewords result in a path of 1's from the
upper left hand corner to the lower right hand corner,
causing Valid to be 1. Conversely, a non-codeword
causes Valid to be 0.

4) All horizontal lines from decoders drive binomial
coefficient generators, each of which drives one of three bus
lines that feed the inputs of an adder whose output is the
Index. Specifically, a 1 on the horizontal line causes the
corresponding binomial coefficient generator to drive its
line. A 0 disconnects the binomial coefficient generator.
For example, in the case of $x_5x_4x_3x_2x_1x_0 = 111000$, the
three horizontal lines driven by decoders cause $\binom{5}{3}$, $\binom{4}{2}$,
and $\binom{3}{1}$ to be applied to the three adder inputs, resulting
in 19 at the output, which is the index of 111000.

5)  As  shown  in  Fig.  1,  each  input  to  the  adder  is  driven  by 
four  binomial  coefficient  generators.  The  first  (leftmost)  1 
in  the  constant  weight  codeword  specifies  which  binomial 
coefficient  generator  drives  the  left  input  of  the  adder.  The 
second  1  determines  which  drives  the  middle  adder  input, 
and  the  third  (rightmost)  1  determines  which  drives  the 
right  adder  input. 
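
The five observations above can be summarized in a small behavioral model. The Python sketch below is our reading of the described behavior, since it tracks only how many 1's and 0's the path of 1's has passed; it is not a gate-level description of Fig. 1 or Fig. 2. The names `convert`, `ones_seen`, and `zeros_seen` are hypothetical, and returning `None` as the index of a non-codeword is a choice made for this sketch.

```python
from math import comb


def convert(bits, r):
    """Behavioral sketch of the decoder array: return (valid, index).

    `bits` lists x_{n-1} ... x_0 as 0/1 integers. A 1 moves the path
    horizontally (one more 1 seen), a 0 moves it vertically (one more
    0 seen). The path stays inside the (r+1) by (n-r+1) array only if
    the word has exactly r ones.
    """
    n = len(bits)
    ones_seen = 0
    zeros_seen = 0
    adder_inputs = [0] * r  # one bus line per digit of the number
    for offset, x in enumerate(bits):
        position = n - 1 - offset
        if x == 1:
            if ones_seen == r:
                return False, None  # too many 1's: path leaves the array
            # The active generator drives this adder input.
            adder_inputs[ones_seen] = comb(position, r - ones_seen)
            ones_seen += 1
        else:
            if zeros_seen == n - r:
                return False, None  # too many 0's: path leaves the array
            zeros_seen += 1
    return True, sum(adder_inputs)
```

For example, `convert([1, 1, 1, 0, 0, 0], 3)` gives `(True, 19)`, while a word with four 1's is reported as invalid.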

C.  Complexity  of  Implementation 

The complexity of the constant weight code to index
converter is dominated by the array of binomial coefficient
generators and decoders. This array is a rectangle of $r + 1$ by
$n - r + 1$ cells, for a total of $(r + 1)(n - r + 1)$ cells. With $r = n/2$,
the (worst case) number of cells is $O(n^2)$. The decoder has
a complexity that is independent of n. However, the binomial
coefficient generator requires $O(n)$ tri-state buffers. That is,
the binomial coefficient with the most tri-state buffers is the
one in the upper left hand corner; it realizes $\binom{n-1}{r}$, which
requires no more than $O(n)$ tri-state buffers. Thus, the total
complexity is $O(n^3)$, and so the constant weight codeword
to index converter has complexity polynomial in n. Table
II shows the exact number of tri-state buffers and decoders
needed in the proposed constant weight codeword to index
converter. In the case of the tri-state buffers, the array cell at
the top of each column corresponds to the largest binomial
coefficient in that column and thus determines the number of
bits needed for that adder input.

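TABLE II

The number of tri-state buffers and decoders needed in the
constant weight codeword to index converter

| n | r | # of tri-state buffers | # of decoders |
|---|---|---|---|
| 4 | 2 | 12 | 9 |
| 6 | 3 | 36 | 16 |
| 8 | 4 | 90 | 25 |
| 10 | 5 | 162 | 36 |
| 12 | 6 | 266 | 49 |
| 14 | 7 | 424 | 64 |
| 16 | 8 | 648 | 81 |
| 18 | 9 | 920 | 100 |
| 20 | 10 | 1254 | 121 |
| 22 | 11 | 1680 | 144 |
| 24 | 12 | 2184 | 169 |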
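
The counts in Table II can be checked with a short calculation. The sketch below assumes that every binomial coefficient generator in a column uses as many tri-state buffers as the bit width of the largest coefficient in that column, which is our interpretation of the remark that the top cell determines the number of bits for that adder input. The function names are hypothetical.

```python
from math import comb


def decoder_count(n: int, r: int) -> int:
    """Cells in the (r+1) by (n-r+1) decoder array."""
    return (r + 1) * (n - r + 1)


def tristate_count(n: int, r: int) -> int:
    """Tri-state buffers, assuming one buffer per bit per generator.

    The column for digit k holds n-r+1 generators; its largest
    coefficient is C(n-r+k-1, k), whose bit length is taken as the
    width of every generator in that column (an assumption).
    """
    generators_per_column = n - r + 1
    widths = [comb(n - r + k - 1, k).bit_length() for k in range(1, r + 1)]
    return generators_per_column * sum(widths)


if __name__ == "__main__":
    for n in range(4, 26, 2):
        r = n // 2
        print(n, r, tristate_count(n, r), decoder_count(n, r))
```

Under this assumption the calculation agrees with the rows of Table II, for example 36 tri-state buffers and 16 decoders for n = 6, r = 3.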

The longest path in the circuit is from $x_{n-1}$ through the
array to the Valid output, and it is $O(n)$. The delay of the
adder can be neglected, since it is $O(\log n)$. Thus, the overall
delay is $O(n)$.


D.  FPGA  Resources  Used 

To understand how the complexity of an $\binom{n}{r}$ combinatorial
number system constant weight code to index converter
depends on n and r, we implemented this system for various n
and r on the 40 nm Altera Stratix IV EP4SE530F43C3NES
FPGA. Table III shows the delay obtained and the resources
used in this implementation. The leftmost column shows the
constant weight code as a binomial number. For example,
$\binom{128}{64}$ corresponds to a 64-out-of-128 bit code. The second
column shows how many bits in the output Index are needed
to represent the largest index of this code. The third and fourth
columns give the frequency achieved and the corresponding
delay, which is inversely proportional to the frequency. The
rightmost column gives the number of ALMs needed to realize
this circuit, which is a measure of the area. Although this table
shows only balanced constant weight code generators where the
number of bits is a power of 2, our approach applies to any
number of bits and to any weight.

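TABLE III

Delay and resources used to realize combinatorial number
system constant weight code generators on the Altera
Stratix IV EP4SE530F43C3NES FPGA.

| Con. Wgt. Code $\binom{n}{r}$ | # Bits Index | Freq. (MHz) | Delay (ns) | Est. # of Packed ALMs |
|---|---|---|---|---|
| $\binom{4}{2}$ | 3 | 261.6 | 3.8 | 2 (0%) |
| $\binom{8}{4}$ | 7 | 178.7 | 5.6 | 17 (0%) |
| $\binom{16}{8}$ | 14 | 104.4 | 9.6 | 120 (0%) |
| $\binom{32}{16}$ | 30 | 57.5 | 17.4 | 647 (0%) |
| $\binom{64}{32}$ | 61 | 31.5 | 31.7 | 3,203 (1%) |
| $\binom{128}{64}$ | 125 | 15.2 | 65.8 | 20,497 (9%) |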

Our circuit was synthesized using Synplify Pro and modeled
using ModelSim. A large codeword is achievable; a
64-out-of-128 bit converter uses only 9% of the available ALMs. The
large values of n required special Verilog programming. For
example, implementing the 64-out-of-128 bit constant weight
codeword to index converter requires that the binary value
of $\binom{127}{64}$ be applied to the adder circuit. This value is much
too large for Synplify Pro. To overcome this deficiency, we
computed the binary value of $\binom{127}{64}$ and other values of $\binom{n}{r}$
in a MATLAB program and wrote them to a header file that was
included in the Verilog code.
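
The precomputation step described above can be sketched as follows. This is not the MATLAB program used for the implementation; it is a Python stand-in, chosen because Python integers have arbitrary precision. The macro naming scheme `BINOM_<c>_<k>` and the layout of the generated header are assumptions made for illustration only.

```python
from math import comb


def binomial_header(n: int, r: int) -> str:
    """Return Verilog `define lines for the constants of an r-out-of-n converter.

    For digit k (k = 1 .. r), a 1 can sit at positions k-1 .. n-r+k-1,
    so each adder input needs n-r+1 constants C(c, k). Every constant of
    a digit is padded to the width of the largest one, C(n-r+k-1, k).
    The macro names are hypothetical.
    """
    lines = []
    for k in range(1, r + 1):
        width = max(1, comb(n - r + k - 1, k).bit_length())
        for c in range(k - 1, n - r + k):
            value = comb(c, k)
            # Sized binary literal, e.g. 4'b1010 for C(5, 3) = 10.
            lines.append(f"`define BINOM_{c}_{k} {width}'b{value:0{width}b}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(binomial_header(6, 3), end="")
```

For the 3-out-of-6 converter this yields twelve constants, four per adder input. The same function handles n = 128, r = 64 without overflow, and the returned text could then be written to a file and included from the Verilog source.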

III.  Complex  Disjoint  Decomposition  Solution 

It can be seen from Fig. 1 that the longest path through the
array of the constant weight codeword converter has length
n, where n is the number of bits in the constant weight
code. In computing the index, each 1 contributes a value that
depends on the number of 1's that preceded it. A 1 in the
leftmost bit position is an exception to this. This 1 always
contributes $\binom{5}{3} = 10$. This can be seen in Table I; the constant
weight codewords with a 1 in the leftmost bit correspond to
combinatorial numbers in which the most significant digit is
$\binom{5}{3}$. However, a 1 in the second bit from the left contributes
a different value to its combinatorial number representation
depending on whether the leftmost bit is 1 or 0. If 1, then the
second bit contributes $\binom{4}{2}$. If 0, then it contributes $\binom{4}{3}$.

A similar phenomenon exists at the right side. Interestingly,
the least significant bit, whether 0 or 1, contributes 0 to the
combinatorial number's value. This is because that bit is
"forced" to be 0 or 1 depending on whether there are three
or two 1 bits to its left. However, note that the rightmost digit of
the combinatorial number is nonzero iff the least significant
bit of the constant weight code is 0. This can also be seen
in the circuit of Fig. 1. Here, if $x_0$ is 1, then 0 drives the
least significant digit, and none of the other three binomial
coefficients can drive this least significant digit. That is, $x_0$
and the only decoder it drives are the "mirror" image of $x_5$ and
the only decoder it drives. Similarly, $x_1$ and the two decoders
it drives are the mirror image of $x_4$ and the two decoders it
drives. Therefore, we can realize the same circuit by reversing
the decoders in Fig. 1 that are driven by $x_2$, $x_1$, and $x_0$. The
new circuit is shown in Fig. 3.



Fig.  3.  Constant  Weight  Codeword  to  Index  Converter  Circuit  Consisting  of 
Two  Subcircuits. 

In the new circuit, the inputs are divided into two parts,
$\{x_5, x_4, x_3\}$ and $\{x_2, x_1, x_0\}$, where each part drives a
separate subcircuit. The two subcircuits, in turn, drive inputs to the
adder, which, in turn, drives the Index output. Such a circuit
is said to have a complex disjoint decomposition (CDD).
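
The decomposition can be illustrated in software. In the sketch below, the upper half of the codeword is scanned from the left and the lower half from the right, so neither scan depends on the other. This is a behavioral illustration under our own assumptions, not a description of the gates in Fig. 3: the function name `cdd_convert`, the even split of the inputs, and the validity test by adding the two counts of 1's are all choices made for this sketch.

```python
from math import comb


def cdd_convert(bits, r):
    """Two-part behavioral sketch: return (valid, index).

    `bits` lists x_{n-1} ... x_0 as 0/1 integers. In the upper half,
    the j-th 1 from the left is digit r-j+1. In the lower half, the
    m-th 1 from the right is digit m. Both rules agree when the word
    has exactly r ones.
    """
    n = len(bits)
    split = n - n // 2
    upper, lower = bits[:split], bits[split:]

    upper_sum, upper_ones = 0, 0
    for offset, x in enumerate(upper):  # scan from the leftmost bit
        if x:
            if upper_ones < r:
                upper_sum += comb(n - 1 - offset, r - upper_ones)
            upper_ones += 1

    lower_sum, lower_ones = 0, 0
    for position, x in enumerate(reversed(lower)):  # scan from bit 0
        if x:
            lower_ones += 1
            lower_sum += comb(position, lower_ones)

    valid = (upper_ones + lower_ones == r)
    return valid, (upper_sum + lower_sum if valid else None)
```

Because each scan covers only about half of the bits, the two partial sums can be formed concurrently, which mirrors the shorter longest path of the two-subcircuit arrangement.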

Table IV shows the delay achieved and the resources used
for the CDD circuit. The benefit of the new circuit is its
reduced delay, especially in large circuits. This can be seen
by comparing Tables IV and III. For example, for
64-out-of-128 codes, the delay of the circuit consisting of two
subcircuits is 64% that of the full rectangle circuit.

TABLE IV

Delay and resources used for the CDD constant weight
codeword to index converter on the Altera Stratix IV
EP4SE530F43C3NES FPGA.

| Con. Wgt. Code $\binom{n}{r}$ | # Bits Index | Freq. (MHz) | Delay (ns) | Est. # of Packed ALMs |
|---|---|---|---|---|
| $\binom{4}{2}$ | 3 | 262.9 | 3.8 | 3 (0%) |
| $\binom{8}{4}$ | 7 | 193.5 | 5.2 | 16 (0%) |
| $\binom{16}{8}$ | 14 | 127.9 | 7.8 | 116 (0%) |
| $\binom{32}{16}$ | 30 | 80.8 | 12.4 | 657 (0%) |
| $\binom{64}{32}$ | 61 | 47.8 | 20.9 | 3,503 (1%) |
| $\binom{128}{64}$ | 125 | 23.6 | 42.4 | 20,723 (9%) |

IV.  Concluding  Remarks 
Although there is a need for a circuit that computes an index
from a constant weight codeword, we have not seen a simple
implementation. We show a circuit based on the combinatorial
number system that has complexity $O(n^3)$, where n is the
number of bits in the code. Our circuit is useful, for example,
in the encoding/decoding of data, such as between on-chip
and off-chip, and in delay-insensitive logic for asynchronous
circuits. It has only $O(n)$ delay. We also show an improvement
that reduces the delay (to 64% of the original for the
64-out-of-128 code) while retaining $O(n^3)$
complexity. We have implemented our designs on an Altera
Stratix IV EP4SE530F43C3NES FPGA. This has shown that
both circuits are efficiently implemented.